Smart Paper Notes

Learning-based Predictive Path Following Control for Nonlinear Systems Under Uncertain Disturbances

Rui Yang , Lei Zheng , Jiesen Pan , Hui Cheng

Category: Robotics

2022-12-26

Accurate path following is challenging for autonomous robots operating in uncertain environments. Adaptive and predictive control strategies are crucial for a nonlinear robotic system to achieve high-performance path following control. In this paper, we propose a novel learning-based predictive control scheme that couples a high-level model predictive path following controller (MPFC) with a low-level learning-based feedback linearization controller (LB-FBLC) for nonlinear systems under uncertain disturbances. The low-level LB-FBLC utilizes Gaussian Processes to learn the uncertain environmental disturbances online and tracks the reference state accurately with a probabilistic stability guarantee. Meanwhile, the high-level MPFC exploits the linearized system model augmented with a virtual linear path dynamics model to optimize the evolution of path reference targets, and provides the reference states and controls for the low-level LB-FBLC. Simulation results illustrate the effectiveness of the proposed control strategy on a quadrotor path following task under unknown wind disturbances.
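The abstract says the low-level controller learns disturbances online with Gaussian Processes. As a rough illustration only, the sketch below fits a one-dimensional GP with an RBF kernel to noisy samples of a made-up disturbance and returns a posterior mean and standard deviation. The kernel choice, hyperparameters, toy disturbance, and function names (`rbf_kernel`, `gp_posterior`) are all assumptions for this example and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_query)
    K_ss = rbf_kernel(x_query, x_query)
    L = np.linalg.cholesky(K)  # K is symmetric positive definite
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Hypothetical disturbance (e.g. a wind force along one axis) sampled with noise.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 5.0, 25)
y_train = 0.5 * np.sin(x_train) + 0.05 * rng.standard_normal(x_train.shape)

x_query = np.array([1.0, 2.5, 4.0])
mean, std = gp_posterior(x_train, y_train, x_query)
```

In a controller of this kind, the posterior mean would typically be used to compensate the disturbance and the standard deviation to reason about confidence bounds; how the paper actually does this is not specified in the abstract.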

Translated by Google Translate

SASFormer: Transformers for Sparsely Annotated Semantic Segmentation

Hui Su , Yue Ye , Wei Hua , Lechao Cheng , Mingli Song

Category: Computer Vision

2022-12-05

Semantic segmentation based on sparse annotation has advanced in recent years. It labels only part of each object in the image, leaving the remainder unlabeled. Most existing approaches are time-consuming and often require a multi-stage training strategy. In this work, we propose a simple yet effective sparsely annotated semantic segmentation framework based on SegFormer, dubbed SASFormer, that achieves remarkable performance. Specifically, the framework first generates hierarchical patch attention maps, which are then multiplied by the network predictions to produce correlated regions separated by valid labels. In addition, we introduce an affinity loss to ensure consistency between the features of the correlation results and the network predictions. Extensive experiments show that the proposed approach is superior to existing methods and achieves cutting-edge performance. The source code is available at https://github.com/su-hui-zz/SASFormer.

xTrimoABFold: Improving Antibody Structure Prediction without Multiple Sequence Alignments

Yining Wang , Xumeng Gong , Shaochuan Li , Bing Yang , YiWu Sun , Chuan Shi , Hui Li , Yangang Wang , Cheng Yang , Le Song

Category: Artificial Intelligence

2022-11-30

In the field of antibody engineering, an essential task is to design a novel antibody whose paratopes bind to a specific antigen with correct epitopes. Understanding antibody structure and its paratope can facilitate a mechanistic understanding of its function. Therefore, antibody structure prediction from sequence alone has always been a highly valuable problem for de novo antibody design. AlphaFold2, a breakthrough in the field of structural biology, provides a solution to predict protein structure based on protein sequences and computationally expensive coevolutionary multiple sequence alignments (MSAs). However, its low computational efficiency and unsatisfactory prediction accuracy on antibodies, especially on the complementarity-determining regions (CDRs), limit its application in industrial high-throughput drug design. To learn an informative representation of antibodies, we employed a deep antibody language model (ALM) on curated sequences from the Observed Antibody Space database via a transformer model. We also developed a novel model named xTrimoABFold to predict antibody structure from antibody sequence based on the pretrained ALM as well as efficient evoformers and structural modules. The model was trained end-to-end on the antibody structures in PDB by minimizing the ensemble loss of domain-specific focal loss on CDR and the frame-aligned point loss. xTrimoABFold outperforms AlphaFold2 and other protein language model based SOTAs, e.g., OmegaFold, HelixFold-Single, and IgFold, by a large margin (30+% improvement in RMSD) while performing 151 times faster than AlphaFold2. To the best of our knowledge, xTrimoABFold achieved state-of-the-art antibody structure prediction. Its improvement in both accuracy and efficiency makes it a valuable tool for de novo antibody design and could lead to further improvements in immuno-theory.

Localizing Anatomical Landmarks in Ocular Images using Zoom-In Attentive Networks

Xiaofeng Lei , Shaohua Li , Xinxing Xu , Huazhu Fu , Yong Liu , Yih-Chung Tham , Yangqin Feng , Mingrui Tan , Yanyu Xu , Jocelyn Hui Lin Goh

Category: Computer Vision | Machine Learning

2022-09-25

Localizing anatomical landmarks is an important task in medical image analysis. However, the landmarks to be localized often lack prominent visual features. Their locations are elusive and easily confused with the background, and thus precise localization highly depends on the context formed by their surrounding areas. In addition, the required precision is usually higher than in segmentation and object detection tasks. Therefore, localization has its own challenges, different from those of segmentation or detection. In this paper, we propose a zoom-in attentive network (ZIAN) for anatomical landmark localization in ocular images. First, a coarse-to-fine, or "zoom-in", strategy is used to learn contextualized features at different scales. Then, an attentive fusion module is adopted to aggregate multi-scale features. It consists of 1) a co-attention network with a multiple regions-of-interest (ROIs) scheme that learns complementary features from the multiple ROIs, and 2) an attention-based fusion module that integrates the multi-ROI features and non-ROI features. We evaluated ZIAN on two open challenge tasks, i.e., fovea localization in fundus images and scleral spur localization in AS-OCT images. Experiments show that ZIAN achieves promising performance and outperforms state-of-the-art localization methods. The source code and trained models of ZIAN are available at https://github.com/leixiaofeng-astar/OMIA9-ZIAN.

Lamarckian Platform: Pushing the Boundaries of Evolutionary Reinforcement Learning towards Asynchronous Commercial Games

Hui Bai , Ruimin Shen , Yue Lin , Botian Xu , Ran Cheng

Category: Machine Learning | Artificial Intelligence | Neural and Evolutionary Computing

2022-09-21

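Despite emerging progress in integrating evolutionary computation into reinforcement learning, the absence of a high-performance platform offering composability and massive parallelism causes non-trivial difficulties for research and applications related to asynchronous commercial games. Here we introduce Lamarckian, an open-source platform that supports evolutionary reinforcement learning scalable to distributed computing resources. To improve training speed and data efficiency, Lamarckian adopts optimized communication methods and an asynchronous evolutionary reinforcement learning workflow. To meet the demand for an asynchronous interface from commercial games and various methods, Lamarckian tailors an asynchronous Markov Decision Process interface and designs an object-oriented software architecture with decoupled modules. In comparison with the state-of-the-art RLlib, we empirically demonstrate the unique advantages of Lamarckian on benchmark tests with up to 6000 CPU cores: i) both sampling efficiency and training speed are doubled when running PPO on the Google football game; ii) training is 13 times faster when running PBT+PPO on the Pong game. Moreover, we present two use cases: i) how Lamarckian is applied to generating behavior-diverse game AI; ii) how Lamarckian is applied to game balancing tests for an asynchronous commercial game.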

Volumetric-based Contact Point Detection for 7-DoF Grasping

Junhao Cai , Jingcheng Su , Zida Zhou , Hui Cheng , Qifeng Chen , Michael Y Wang

Category: Robotics

2022-09-14

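In this paper, we propose a novel grasp pipeline based on contact point detection on the truncated signed distance function (TSDF) volume to achieve closed-loop 7-degree-of-freedom (7-DoF) grasping in cluttered environments. The key aspects of our method are that 1) the proposed pipeline combines multi-view fusion, contact-point sampling and evaluation, and collision checking to provide reliable and collision-free 7-DoF gripper poses with real-time performance; 2) the contact-based pose representation effectively eliminates the ambiguity introduced by normal-based methods, which provides a more precise and flexible solution. Extensive simulated and real-robot experiments demonstrate that the proposed pipeline can select more antipodal and stable grasp poses and outperforms normal-based baselines in terms of grasp success rate in both simulated and physical scenarios.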
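For readers unfamiliar with TSDF volumes, the following minimal sketch shows the standard idea for voxels along a single camera ray: the signed distance to the observed surface is truncated to [-1, 1], and observations from several views are merged with a weighted running average. This is a generic, simplified illustration (it updates every voxel, including those far behind the surface), not the authors' pipeline; the function names, the truncation distance, and the depth values are assumptions made for the example.

```python
import numpy as np

def tsdf_from_depth(voxel_depth, measured_depth, trunc):
    """Truncated signed distance of voxels along one camera ray.

    Positive values lie in front of the observed surface, negative behind.
    """
    sdf = measured_depth - voxel_depth
    return np.clip(sdf / trunc, -1.0, 1.0)

def fuse(tsdf, weight, new_tsdf, new_weight=1.0):
    """Weighted running average used to merge a new view into the volume."""
    fused = (tsdf * weight + new_tsdf * new_weight) / (weight + new_weight)
    return fused, weight + new_weight

# Hypothetical voxel depths along one ray (metres) and two depth readings.
voxel_depth = np.array([0.90, 0.98, 1.00, 1.02, 1.10])
tsdf_a = tsdf_from_depth(voxel_depth, measured_depth=1.00, trunc=0.04)
tsdf_b = tsdf_from_depth(voxel_depth, measured_depth=1.02, trunc=0.04)

fused, weight = fuse(tsdf_a, np.ones_like(tsdf_a), tsdf_b)
```

The zero crossing of the fused values approximates the surface location; contact points for grasping would be sought near such crossings.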

A Secure and Efficient Multi-Object Grasping Detection Approach for Robotic Arms

Hui Wang , Jieren Cheng , Yichen Xu , Sirui Ni , Zaijia Yang , Jiangpeng Li

Category: Robotics | Artificial Intelligence

2022-09-08

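Robotic arms are widely used in the automation industry. However, with the wide application of deep learning in robotic arms, new challenges arise, such as the allocation of computing power for grasping and the growing demand for security. In this work, we propose a robotic arm grasping approach based on deep learning and edge-cloud collaboration. This approach realizes arbitrary grasp planning for the robotic arms and considers both grasping efficiency and information security. In addition, an encoder and decoder trained with a GAN allow images to be encrypted while being compressed, which ensures privacy. The model achieves 92% accuracy on the OCID dataset, the image compression ratio reaches 0.03%, and the structural difference value is higher than 0.91.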

Re-Attention Transformer for Weakly Supervised Object Localization

Hui Su , Yue Ye , Zhiwei Chen , Mingli Song , Lechao Cheng

Category: Computer Vision

2022-08-03

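Weakly supervised object localization is a challenging task that aims to localize objects with coarse annotations such as image categories. Existing deep network approaches are mainly based on the class activation map, which focuses on highlighting discriminative local regions while ignoring the full object. In addition, emerging transformer-based techniques constantly place a lot of emphasis on the background, which impedes the ability to identify complete objects. To address these issues, we present a re-attention mechanism termed token refinement transformer (TRT) that captures object-level semantics to guide the localization well. Specifically, TRT introduces a novel module named the token priority scoring module (TPSM) to suppress the effects of background noise while focusing on the target object. Then, we incorporate the class activation map as the semantically aware input to restrain the attention map to the target object. Extensive experiments on two benchmarks showcase the superiority of our proposed method over existing methods that use image category annotations. The source code is available at https://github.com/su-hui-zz/reattentiontransformer.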

Federated Deep Reinforcement Learning for RIS-Assisted Indoor Multi-Robot Communication Systems

Ruyu Luo , Wanli Ni , Hui Tian , Julian Cheng

Category: Robotics

2022-07-17

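Indoor multi-robot communications face two key challenges: one is the severe signal strength degradation caused by blockages (e.g., walls), and the other is the dynamic environment caused by robot mobility. To address these issues, we consider a reconfigurable intelligent surface (RIS) to overcome signal blockage and assist the trajectory design among multiple robots. Meanwhile, non-orthogonal multiple access (NOMA) is adopted to cope with spectrum scarcity and enhance the connectivity of robots. Considering the limited battery capacity of robots, we aim to maximize energy efficiency by jointly optimizing the transmit power of the access point (AP), the phase shifts of the RIS, and the trajectories of the robots. A novel federated deep reinforcement learning (F-DRL) approach is developed to solve this challenging problem with one dynamic long-term objective. Because each robot plans its own path and downlink power, the AP only needs to determine the phase shifts of the RIS, which significantly reduces the computation overhead owing to the reduced training dimension. Simulation results reveal the following findings: i) the proposed F-DRL reduces convergence time by at least 86% compared to centralized DRL; ii) the designed algorithm can adapt to an increasing number of robots; iii) compared to traditional OMA-based benchmarks, NOMA-enhanced schemes achieve higher energy efficiency.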

DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding

Liang Qiao , Hui Jiang , Ying Chen , Can Li , Pengfei Li , Zaisheng Li , Baorui Zou , Dashan Guo , Yingda Xu , Yunlu Xu

Category: Computer Vision

2022-07-14

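This paper presents DavarOCR, an open-source toolbox for OCR and document understanding tasks. DavarOCR currently implements 19 advanced algorithms, covering 9 different task forms. DavarOCR provides detailed usage instructions and trained models for each algorithm. Compared with previous open-source OCR toolboxes, DavarOCR has relatively more complete support for the sub-tasks of cutting-edge document understanding technology. To promote the development and application of OCR technology in academia and industry, we pay more attention to modules that different sub-domains of the technology can share. DavarOCR is publicly released at https://github.com/hikopensource/davar-lab-ocr.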
